Chapter 9 – Emerging Biophysics Techniques 399
creatures. But a key feature of bacterial chemotaxis is that different proteins in the pathway
can be tagged using fluorescent protein labeling strategies (see Chapter 7) and tracked with
advanced single-molecule localization microscopy techniques (see Chapter 4). This reveals
the spatial distribution of each component in real time as a function of perturbation in the
external chemoattractant concentration, which enables systems biophysics models of the
whole-cell level process of bacterial chemotaxis to be developed.
9.2.2 MOLECULAR NETWORKS
Many complex systems, both biological and nonbiological, can be represented as networks,
and these share several common features. Here, the components of the system feature as
nodes, while the interactions between the components are manifested as edges that link nodes.
There also exist motifs in networks, which are commonly occurring subdomain patterns
found across many different networks, for example, feedback loops. Modularity
is therefore also common, in that a network can be seen to be composed
of different modular units of motifs. Also, many networks in biology tend to be scale-free
networks. This means that their degree distribution follows a power law such that the
fraction P(k) of nodes in the network having k connections to other nodes satisfies
$$P(k) = A k^{-\gamma} \tag{9.1}$$
where
γ is usually in the range ~2–3
A is a normalization constant ensuring that the sum of all P values is exactly 1
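The role of the normalization constant in Equation 9.1 can be illustrated with a short numerical sketch. It is not taken from the text: the function names, the finite cutoff k_max, and the toy edge list are assumptions made purely for illustration. The first function fixes A by requiring the P(k) values to sum to 1 over a truncated range of k, and the second estimates an empirical P(k) from a list of undirected edges.

```python
from collections import Counter


def power_law_distribution(gamma, k_min=1, k_max=1000):
    """Return {k: P(k)} with P(k) = A * k**(-gamma) for k_min <= k <= k_max.

    The finite upper cutoff k_max is an assumption of this sketch.
    """
    weights = {k: k ** (-gamma) for k in range(k_min, k_max + 1)}
    # A is fixed by requiring that all P(k) values sum to 1.
    A = 1.0 / sum(weights.values())
    return {k: A * w for k, w in weights.items()}


def degree_distribution(edges):
    """Empirical P(k) from a list of undirected edges (u, v).

    Nodes with no edges do not appear in an edge list and are ignored.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    return {k: c / len(degree) for k, c in sorted(counts.items())}


p = power_law_distribution(gamma=2.5)
total = sum(p.values())

# Hypothetical toy network: a hub "a" linked to three other nodes.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d")]
toy_pk = degree_distribution(toy_edges)
```

Because P(k) is a pure power law, doubling k reduces P(k) by a fixed factor of 2^γ regardless of the starting value of k, which is the sense in which such a network has no characteristic scale.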
Molecular networks allow regulation of processes at the cost of some redundancy in the
system, but also impart robustness to noise. The most detailed type of molecular network
involves metabolic reactions, since these involve not only reactions, substrates, and products,
but also enzymes that catalyze the reactions.
The Barabási–Albert model is an algorithm for generating random scale-free networks.
It operates by adding new nodes to an initial system, with each new node forming edges to
existing nodes by a method of probabilistic (preferential) attachment in which well-connected
nodes are more likely to receive new links. It is valuable here in the context of creating a
synthetic, controlled network that has scale-free properties but which is a more reduced
version of a real, complex biological network. Thus, it can be used to develop general
analytical methods for investigating scale-free network properties. Of key importance here is
the robust identification of genuine nodes in a real network. There are several node clustering algorithms available, for
example, the k-means algorithm mentioned briefly earlier in the context of identifying
different Förster resonance energy transfer (FRET) states in fluorescence microscopy (see
Chapter 4) and to clustering of images of the same subclass in principal component analysis
(PCA) (see Chapter 8).
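Returning to the Barabási–Albert model described above, a minimal sketch of one common formulation is given here, using only the Python standard library. The function name, the choice of a small fully connected seed network, and the parameter values are assumptions for illustration rather than details from the text. Each new node links to m distinct existing nodes, chosen with probability proportional to their current degree.

```python
import random
from collections import Counter


def barabasi_albert(n_nodes, m, seed=None):
    """Grow an undirected network by preferential attachment.

    Returns a list of edges (u, v) between integer node labels.
    """
    if m < 1 or n_nodes < m + 1:
        raise ValueError("need m >= 1 and n_nodes >= m + 1")
    rng = random.Random(seed)

    # Seed network (an assumption of this sketch): m + 1 fully connected nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]

    # Each node appears in `endpoints` once per edge it belongs to, so a
    # uniform draw from this list selects a node with probability
    # proportional to its degree.
    endpoints = [node for edge in edges for node in edge]

    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


edges = barabasi_albert(n_nodes=2000, m=2, seed=1)
degree = Counter(node for edge in edges for node in edge)
counts = Counter(degree.values())
pk = {k: c / len(degree) for k, c in sorted(counts.items())}
```

Plotting pk on log-log axes should give an approximately straight line for a network of this kind. For large networks the model is known to approach a power law with γ close to 3, although a small simulated network will show scatter in the high-k tail.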
The general k-means clustering algorithm partitions a data set of n points into k clusters,
each characterized by its mean value, such that k < n. It is structured as follows:
1 Initialize by randomly generating k initial clusters, each with an associated mean value,
from the data set where k is usually relatively small compared to n.
2 k clusters are created by associating each data point with the nearest cluster
mean. This can often be represented visually using partitions between the data
points on a Voronoi diagram. This is mathematically equivalent to assigning each data
point to the cluster whose mean value results in the minimum within-cluster sum of
squares value.
3 After partitioning, calculate the new centroid value of the data points in each of
the k clusters.
4 Iterate steps 2 and 3 until convergence. At this stage, rejection/acceptance criteria
can also be applied on putative clusters (e.g., to insist that to be within a given cluster